A Comparison of Normalization and Training Approaches for ASR-Dependent Speaker Identification
Abstract
In this paper we discuss a speaker identification approach, called ASR-dependent speaker identification, that incorporates phonetic knowledge into the models for each speaker. This approach differs from traditional methods for text-independent speaker identification, such as global Gaussian mixture modeling, which typically ignore the phonetic content of the speech signal. We introduce a new score normalization approach, called phone adaptive normalization, which improves upon our previous speaker adaptive normalization technique. This paper also examines the use of automatically generated transcriptions during the training of our speaker models. Experiments show that speaker models trained using automatically generated transcriptions achieve the same performance as models trained using manually generated transcriptions.
Similar resources
A comparison of normalization techniques applied to latent space representations for speech analytics
In noisy environments, Automatic Speech Recognition (ASR) systems usually produce poor transcription quality, which also negatively impacts the performance of speech analytics. Various methods have been proposed to compensate for the adverse effect of ASR errors, mainly by projecting transcribed words into an abstract space. In this paper, we seek to identify themes from dialogues of teleph...
Robustness in ASR: An Experimental Study of the Interrelationship between Discriminant Feature-Space Transformation, Speaker Normalization and Environment Compensation
This thesis addresses the general problem of maintaining robust automatic speech recognition (ASR) performance under diverse speaker populations, channel conditions, and acoustic environments. To this end, the thesis analyzes the interactions between environment compensation techniques, frequency warping based speaker normalization, and discriminant feature-space transformation (DFT). These int...
MFCC and CMN Based Speaker Recognition in Noisy Environment
The performance of automatic speaker recognition (ASR) systems degrades drastically in the presence of noise and other distortions, especially when there is a noise level mismatch between the training and testing environments. This paper explores the problem of speaker recognition in noisy conditions, assuming that speech signals are corrupted by noise. A major problem of most speaker recognitio...
ASR Dependent Techniques for Speaker Recognition
This thesis is concerned with improving the performance of speaker recognition systems in three areas: speaker modeling, verification score computation, and feature extraction in telephone quality speech. We first seek to improve upon traditional modeling approaches for speaker recognition, which are based on Gaussian Mixture Models (GMMs) trained globally over all speech from a given speaker. ...